Smart Paper Notes

The 6th AI City Challenge

Milind Naphade, Shuo Wang, David C. Anastasiu, Zheng Tang, Ming-Ching Chang, Yue Yao, Liang Zheng, Mohammed Shaiqur Rahman, Archana Venkatachalapathy, Anuj Sharma

Category: Computer Vision

2022-04-21

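The sixth edition of the AI City Challenge focuses on problems in two domains with tremendous untapped potential at the intersection of computer vision and artificial intelligence: Intelligent Traffic Systems (ITS) and brick-and-mortar retail businesses. The four challenge tracks of the 2022 AI City Challenge received participation requests from 254 teams across 27 countries and regions. Track 1 addressed city-scale multi-target multi-camera (MTMC) vehicle tracking. Track 2 addressed natural-language-based vehicle track retrieval. Track 3 was a brand-new track for naturalistic driving analysis, in which data were captured by several cameras mounted inside the vehicle and focused on driver safety, and the task was to classify driver actions. Track 4 was another new track, aiming to achieve automated retail-store checkout using only a single-view camera. We released two leaderboards for submissions based on different methods: a public leaderboard for the contest, where external data is not allowed, and a general leaderboard for all submitted results. The top performance of participating teams established strong baselines and even outperformed the state of the art in the proposed challenge tracks.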

Translated by Google Translate

COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks

Md. Ismail Hossain, Mohammed Rakib, M. M. Lutfe Elahi, Nabeel Mohammed, Shafin Rahman

Category: Computer Vision

2022-12-24

Pruning refers to the elimination of trivial weights from neural networks. The sub-networks within an overparameterized model produced after pruning are often called lottery tickets. This research aims to generate winning lottery tickets from a set of lottery tickets that can achieve accuracy similar to that of the original unpruned network. We introduce a novel winning ticket called Cyclic Overlapping Lottery Ticket (COLT), obtained by data splitting and cyclic retraining of the pruned network from scratch. We apply a cyclic pruning algorithm that keeps only the overlapping weights of different pruned models trained on different data segments. Our results demonstrate that COLT can achieve accuracies similar to those of the unpruned model while maintaining high sparsity. We show that the accuracy of COLT is on par with the winning tickets of the Lottery Ticket Hypothesis (LTH) and, at times, is better. Moreover, COLTs can be generated using fewer iterations than tickets generated by the popular Iterative Magnitude Pruning (IMP) method. In addition, we observe that COLTs generated on large datasets can be transferred to small ones without compromising performance, demonstrating their generalization capability. We conduct all our experiments on the CIFAR-10, CIFAR-100, and TinyImageNet datasets and report performance superior to state-of-the-art methods.
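The idea of keeping only the overlapping weights of models pruned on different data segments can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes simple per-layer magnitude pruning, and the function names `magnitude_mask` and `overlapping_mask` and the random weight matrices are hypothetical. It covers only the overlap step of one round, not the cyclic retraining.

```python
import numpy as np

def magnitude_mask(weights, prune_fraction):
    """Boolean mask that keeps the largest-magnitude weights (True = keep)."""
    flat = np.abs(weights).ravel()
    n_prune = int(prune_fraction * flat.size)
    mask = np.ones(flat.size, dtype=bool)
    if n_prune > 0:
        # The n_prune smallest-magnitude weights are pruned.
        pruned_idx = np.argsort(flat, kind="stable")[:n_prune]
        mask[pruned_idx] = False
    return mask.reshape(weights.shape)

def overlapping_mask(weights_a, weights_b, prune_fraction):
    """Keep a weight only if it survives pruning in both models."""
    return magnitude_mask(weights_a, prune_fraction) & magnitude_mask(weights_b, prune_fraction)

# Stand-ins for the same layer trained on two different data segments (assumed).
rng = np.random.default_rng(0)
w_a = rng.normal(size=(4, 4))
w_b = rng.normal(size=(4, 4))

mask = overlapping_mask(w_a, w_b, prune_fraction=0.5)
sparsity = 1.0 - mask.mean()
print(f"overlap sparsity: {sparsity:.2f}")
```

Because the overlap is an intersection, each round prunes at least as much as a single magnitude-pruning pass, which is consistent with the abstract's claim that fewer iterations are needed to reach a given sparsity.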

Performance Analysis of YOLO-based Architectures for Vehicle Detection from Traffic Images in Bangladesh

Refaat Mohammad Alamgir, Ali Abir Shuvro, Mueeze Al Mushabbir, Mohammed Ashfaq Raiyan, Nusrat Jahan Rani, Md. Mushfiqur Rahman, Md. Hasanul Kabir, Sabbir Ahmed

Category: Computer Vision

2022-12-18

The task of locating and classifying different types of vehicles has become a vital element in numerous applications of automation and intelligent systems, ranging from traffic surveillance to vehicle identification. In recent times, deep learning models have been dominating the field of vehicle detection. Yet Bangladeshi vehicle detection has remained a relatively unexplored area. One of the main goals of vehicle detection is its real-time application, where 'You Only Look Once' (YOLO) models have proven to be the most effective architecture. In this work, intending to find the best-suited YOLO architecture for fast and accurate vehicle detection from traffic images in Bangladesh, we have conducted a performance analysis of different variants of YOLO-based architectures: YOLOv3, YOLOv5s, and YOLOv5x. The models were trained on a dataset containing 7390 images of 21 vehicle types, comprising samples from the DhakaAI dataset, the Poribohon-BD dataset, and our self-collected images. After thorough quantitative and qualitative analysis, we found the YOLOv5x variant to be the best-suited model, outperforming the YOLOv3 and YOLOv5s models by 7 and 4 percent in mAP, respectively, and by 12 and 8.5 percent in accuracy.

BSpell: A CNN-blended BERT Based Bengali Spell Checker

Chowdhury Rafeed Rahman, MD. Hasibur Rahman, Samiha Zakir, Mohammad Rafsan, Mohammed Eunus Ali

Category: Natural Language Processing

2022-08-20

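Bengali typing is mostly performed using an English keyboard and can be highly erroneous due to the presence of compound and similarly pronounced letters. Correcting a misspelled word requires an understanding of the word typing pattern as well as the context in which the word is used. We propose a specialized BERT model, BSpell, targeted at word-for-word correction at the sentence level. BSpell contains an end-to-end trainable CNN sub-model named SemanticNet along with a specialized auxiliary loss. This allows BSpell to specialize in the highly inflected Bengali vocabulary in the presence of spelling errors. We further propose a hybrid pretraining scheme that combines word-level and character-level masking. Using this pretraining scheme, BSpell achieves 91.5% accuracy on a real-life Bengali spelling correction validation set. A detailed comparison on two Bengali and one Hindi spelling correction datasets shows the superiority of the proposed BSpell over existing spell checkers.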
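A hybrid masking scheme that mixes word-level and character-level masking might look roughly like the sketch below. The abstract does not specify the masking probabilities, mask tokens, or sampling procedure, so `WORD_MASK`, `CHAR_MASK`, `p_word`, `p_char`, and the function `hybrid_mask` are all assumptions made for illustration. The example sentence is romanized to keep character slicing simple.

```python
import random

WORD_MASK = "[MASK]"  # hypothetical word-level mask token
CHAR_MASK = "_"       # hypothetical character-level mask symbol

def hybrid_mask(words, p_word=0.15, p_char=0.15, seed=0):
    """Mask each word either entirely, at a single character, or not at all."""
    rng = random.Random(seed)
    masked = []
    for word in words:
        r = rng.random()
        if r < p_word:
            # Word-level masking: hide the whole word.
            masked.append(WORD_MASK)
        elif r < p_word + p_char and len(word) > 0:
            # Character-level masking: hide one randomly chosen character.
            pos = rng.randrange(len(word))
            masked.append(word[:pos] + CHAR_MASK + word[pos + 1:])
        else:
            masked.append(word)
    return masked

sentence = ["ami", "banglay", "gan", "gai"]  # illustrative romanized words
print(hybrid_mask(sentence, p_word=0.25, p_char=0.25))
```

Word-level masking pushes a model to use sentence context, while character-level masking resembles a spelling error inside an otherwise visible word, which is plausibly why the two are combined.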

Paradigm Shift in Language Modeling: Revisiting CNN for Modeling Sanskrit Originated Bengali and Hindi Language

Chowdhury Rafeed Rahman, MD. Hasibur Rahman, Mohammad Rafsan, Samiha Zakir, Mohammed Eunus Ali, Rafsanjani Muhammod

Category: Natural Language Processing

2021-10-25

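Although there is a large body of work on language modeling (LM) for high-resource languages such as English and Chinese, the area remains unexplored for low-resource languages such as Bengali and Hindi. We propose an end-to-end trainable, memory-efficient CNN architecture named CoCNN to handle specific characteristics of Bengali and Hindi such as high inflection, morphological richness, flexible word order, and phonetic spelling errors. In particular, we introduce two learnable convolutional sub-models, at the word level and at the sentence level, that are end-to-end trainable. We show that state-of-the-art (SOTA) Transformer models, including pretrained BERT, do not necessarily yield the best performance for Bengali and Hindi. CoCNN outperforms pretrained BERT with 16 times fewer parameters, and it achieves better performance than SOTA LSTM models on multiple real-world datasets. This is the first study of the effectiveness of different architectures drawn from three deep learning paradigms (convolutional, recurrent, and Transformer neural networks) for modeling two widely used languages, Bengali and Hindi.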

BayesBeat: Reliable Atrial Fibrillation Detection from Noisy Photoplethysmography Data

Sarkar Snigdha Sarathi Das, Subangkar Karmaker Shanto, Masum Rahman, Md. Saiful Islam, Atif Rahman, Mohammad Mehedy Masud, Mohammed Eunus Ali

Category: Machine Learning

2020-11-02

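Smartwatches and fitness trackers have gained wide popularity as potential health-tracking devices because of their affordability and longitudinal monitoring capabilities. To further widen their health-tracking capabilities, researchers have in recent years started to investigate real-time atrial fibrillation (AF) detection using photoplethysmography (PPG) data, which comes from an inexpensive sensor widely available in almost all smartwatches. A significant challenge in detecting AF from PPG signals comes from the inherent noise in smartwatch PPG signals. In this paper, we propose a novel deep-learning-based approach, BayesBeat, which leverages the power of Bayesian deep learning to accurately infer AF risk from noisy PPG signals while also providing an uncertainty estimate for the prediction. Extensive experiments on two publicly available datasets show that our proposed method, BayesBeat, outperforms existing state-of-the-art methods. Moreover, BayesBeat has 40 to 200 times fewer parameters than state-of-the-art baseline approaches, making it suitable for deployment on resource-constrained wearable devices.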
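One common way to obtain a prediction together with an uncertainty estimate from a Bayesian model is to average several stochastic forward passes and measure the entropy of the averaged probability. The sketch below assumes that setup. The abstract does not say how BayesBeat computes its uncertainty, so the functions `predictive_summary` and `decide`, the entropy threshold, and the sample values are illustrative assumptions.

```python
import numpy as np

def predictive_summary(sampled_probs):
    """Mean AF probability and binary predictive entropy (in nats).

    sampled_probs: AF probabilities from T stochastic forward passes (assumed).
    """
    p = np.asarray(sampled_probs, dtype=float)
    mean_p = float(p.mean())
    eps = 1e-12  # guards against log(0)
    entropy = float(-(mean_p * np.log(mean_p + eps)
                      + (1.0 - mean_p) * np.log(1.0 - mean_p + eps)))
    return mean_p, entropy

def decide(sampled_probs, max_entropy=0.5):
    """Return a label, or 'uncertain' when the predictive entropy is too high."""
    mean_p, entropy = predictive_summary(sampled_probs)
    if entropy > max_entropy:
        return "uncertain"
    return "AF" if mean_p >= 0.5 else "non-AF"

print(decide([0.97, 0.99, 0.98]))  # consistent, confident samples
print(decide([0.10, 0.90, 0.50]))  # samples disagree, e.g. on a noisy segment
```

Flagging high-entropy segments instead of forcing a label is one way an uncertainty estimate can make detection on noisy PPG data more reliable.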

A Review on Drivers Red Light Running Behaviour Predictions and Technology Based Countermeasures

Md Mostafizur Rahman Komol, Mohammed Elhenawy, Shamsunnahar Yasmin, Mahmoud Masoud, Sebastien Glaser, Andry Rakotonirainy

Category: Artificial Intelligence | Neural and Evolutionary Computing

2020-08-15

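Red light running at signalised intersections is a growing road safety issue worldwide, which has led to the rapid development of advanced intelligent transportation technologies and countermeasures. However, existing studies have yet to summarise and present the effect of these technology-based innovations on improving safety. This paper presents a comprehensive review of red light running behaviour prediction methodologies and technology-based countermeasures. Specifically, the study focuses on reviewing two streams of literature targeting red light running and stop-and-go behaviour at signalised intersections: (1) studies that focus on modelling and predicting red light running and stop-and-go related driver behaviour, and (2) studies that focus on the effectiveness of different technology-based countermeasures that combat such unsafe behaviour. The study provides a systematic guide to help researchers and stakeholders understand how best to identify red light running and stop-and-go related driving behaviour, and subsequently implement countermeasures to combat such risky behaviour and improve the associated safety.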

A Comparison Study of Deep CNN Architecture in Detecting of Pneumonia

Al Mohidur Rahman Porag, Md. Mahedi Hasan, Dr. Md Taimur Ahad

Category: Computer Vision | Machine Learning

2022-12-30

Pneumonia, a respiratory infection brought on by bacteria or viruses, affects a large number of people, especially in developing and impoverished countries where high levels of pollution, unclean living conditions, and overcrowding are frequently observed, along with insufficient medical infrastructure. Pleural effusion, a condition in which fluids fill the lung and complicate breathing, is brought on by pneumonia. Early detection of pneumonia is essential for ensuring curative care and boosting survival rates. The approach most commonly used to diagnose pneumonia is chest X-ray imaging. The purpose of this work is to develop a method for the automatic diagnosis of bacterial and viral pneumonia in digital X-ray images. This article first presents the authors' technique and then gives a comprehensive report on recent developments in the field of reliable diagnosis of pneumonia. In this study, we tuned a state-of-the-art deep convolutional neural network to classify plant diseases based on images and tested its performance. Deep learning architectures are compared empirically. VGG19, ResNet152V2, ResNeXt101, SEResNet152, MobileNetV2, and DenseNet201 are among the architectures tested. The experimental data consist of two groups, sick and healthy X-ray images. To take appropriate action against plant diseases as soon as possible, rapid disease identification models are preferred. DenseNet201 showed no overfitting or performance degradation in our experiments, and its accuracy tends to increase as the number of epochs increases. Further, DenseNet201 achieves state-of-the-art performance with a significantly smaller number of parameters and within a reasonable computing time. This architecture outperforms the competition in terms of testing accuracy, scoring 95%. Each architecture was trained using Keras, with Theano as the backend.

Condensed Representation of Machine Learning Data

Rahman Salim Zengin, Volkan Sezer

Category: Machine Learning

2022-12-29

Training a machine learning model requires sufficient data. The sufficiency of the data is not always about quantity, but about relevancy and reduced redundancy. Data-generating processes create massive amounts of data. When used raw, such big data consumes substantial computational resources. Instead of the raw data, a proper Condensed Representation can be used. By combining K-means, a well-known clustering method, with some correction and refinement facilities, a novel Condensed Representation method for machine learning applications is introduced. To present the novel method meaningfully and visually, synthetically generated data is employed. It is shown that acceptably accurate model training is possible using the condensed representation instead of the raw data.
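The K-means part of such a condensation can be sketched as replacing the raw points with cluster centroids plus per-cluster counts that can serve as sample weights. This is a plain Lloyd's-algorithm illustration under stated assumptions: the paper's correction and refinement steps are not described in the abstract and are not reproduced here, and the function `condense` and the synthetic two-blob data are hypothetical.

```python
import numpy as np

def condense(data, k, n_iter=20, seed=0):
    """Condense data into k centroids and the number of points each represents."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Final assignment, so the counts match the returned centroids.
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=k)
    return centroids, counts

# Synthetic data: two well-separated 2-D blobs of 50 points each (assumed).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.1, size=(50, 2))
data = np.vstack([blob_a, blob_b])

centroids, counts = condense(data, k=2)
print(centroids.round(2), counts)
```

A downstream model could then be trained on `centroids` with `counts` as sample weights, so that 100 raw points are represented by 2 weighted points.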

Thermal Heating in ReRAM Crossbar Arrays: Challenges and Solutions

Kamilya Smagulova, Mohammed E. Fouda, Ahmed Eltawil

Category: Machine Learning

2022-12-28

The increasing popularity of deep-learning-powered applications raises the issue of the vulnerability of neural networks to adversarial attacks. In other words, hardly perceptible changes in input data lead to output errors in neural networks, hindering their use in applications that involve decisions with security risks. A number of previous works have already thoroughly evaluated the most commonly used configuration, Convolutional Neural Networks (CNNs), against different types of adversarial attacks. Moreover, recent works have demonstrated the transferability of some adversarial examples across different neural network models. This paper studies the robustness of newly emerging models such as SpinalNet-based neural networks and Compact Convolutional Transformers (CCT) on the image classification problem of the CIFAR-10 dataset. Each architecture was tested against four white-box attacks and three black-box attacks. Unlike the VGG and SpinalNet models, the attention-based CCT configuration demonstrated a large span between strong robustness and vulnerability to adversarial examples. Finally, a study of transferability between VGG, VGG-inspired SpinalNet, and pretrained CCT 7/3x1 models was conducted. It was shown that high effectiveness of an attack on a given individual model does not guarantee transferability to other models.
